You may click on the following links to fast forward to the codes directly producing the Figures:
Back to index page.
This vignette illustrates the label transfer strategy with which the transcriptom states were evaluated for AAV transduced / untransduced / LPS-primed microglia in the original publication (Lin et al., 2022). The strategy is largely inspired by Seurat Multimodal reference mapping vignette and Mapping and annotating query datasets vignette from Satija group.
The input files of this vignette are processed h5Seurat objects, produced by the preprocessing vignettes (see index page). The output is a Seurat object merging both query and reference data in the .h5Seurat format.
Upstream analysis pipelines (i.e. sequence alignment, generation of digital expression matrix, etc.):
cellranger count command and then aggregated using the cellranger aggr command.Downstream analysis:
scater)Seurat)Seurat)Load reuiqred libraries.
suppressPackageStartupMessages({
library(Seurat)
library(SeuratDisk)
library(tidyverse)
library(RColorBrewer)
})
Load the Seurat objects to workspace.
# Preprocessed, QCed Smart-seq2 microglia (Query dataset)
querydata <- LoadH5Seurat('../data/AAV_MG_scRNAseq/query_clean.h5Seurat', verbose = F)
# Preprocessed, QCed 10X (Reference dataset)
refdata <- LoadH5Seurat(file = '../data/AAV_MG_scRNAseq/refdata_clean.h5Seurat', verbose = F)
DimPlot(querydata, group.by = "treatment") | DimPlot(refdata, group.by = "treat")
Find anchors between reference and query.
dims_use <- 1:45
anchors <- FindTransferAnchors(
reference = refdata,
query = querydata,
k.filter = 50,
normalization.method = "LogNormalize",
reference.reduction = "pca",
dims = dims_use
)
Project query onto the reference UMAP structure.
refdata <- RunUMAP(refdata, dims = dims_use, reduction.name = "umap", seed.use = 42, return.model = TRUE)
query.transferred <- MapQuery(
anchorset = anchors,
query = querydata,
reference = refdata,
refdata = list(
state = "state",
treat = "treat"
),
reference.reduction = "pca",
reduction.model = "umap"
)
According to Stuart et al.(2019)1, queried cells with predictions score lower than 0.8 are likely to be incorrect. Here, we see a small fraction of cells with this low prediction scores. For the purpose of illustration, we filter out these cells. Once this is done, the reference and query objects are merged into a single object called refquery.
query.transferred <- SetIdent(query.transferred, cells = WhichCells(query.transferred, expression = predicted.treat.score < 0.8), value = "low")
query.transferred <- SetIdent(query.transferred, cells = WhichCells(query.transferred, expression = predicted.treat.score >= 0.8), value = "high")
query.transferred$score <- Idents(query.transferred)
query.transferred[[c("predicted.treat.score","score")]] %>%
ggplot(aes(x = predicted.treat.score, fill = score)) +
geom_histogram(bins = 20, color = "#333333") +
coord_flip() +
theme_light()
query.transferred.filtered <- subset(query.transferred, subset = predicted.treat.score > 0.8)
query.transferred.filtered$score <- NULL
## Merge objects
refquery <- merge(refdata, query.transferred.filtered)
According to the Multimodal reference mapping vignette, query cells directly mapping to the reference-derived UMAP will project to the most similar cells in the reference, which can potentially mask the presence of query cell types or cell states not captured in the reference. To account for these caveats, it is important to compute a ‘de novo’ UMAP visualization on the merged dataset during biological interpretation. Here we perform the computation and show that microglia from our query does not develop novel states which are absent in the reference.
refquery[['pca']] <- merge(refdata[['pca']], query.transferred.filtered[['ref.pca']])
refquery <- RunUMAP(refquery, reduction = "pca", dims = 1:45, n.epochs = 600)
DimPlot(refquery, group.by = "id") | DimPlot(refquery, group.by = "treat")
aav <- WhichCells(query.transferred, expression = treatment %in% c("AAV-MG1.2","AAV-MG1.1"))
lps <- WhichCells(query.transferred, expression = treatment == "LPS")
Idents(refquery) <- "treat"
refquery <- SetIdent(refquery, cells = aav, value = "AAV transduction")
refquery <- SetIdent(refquery, cells = lps, value = "LPS (Query)")
refquery[["treat"]] <- Idents(refquery)
Idents(refquery) <- "state"
hom_pred <- WhichCells(query.transferred, expression = predicted.state == "Homeostatic")
reac_pred <- WhichCells(query.transferred, expression = predicted.state == "Reactive")
refquery <- SetIdent(refquery, cells = hom_pred, value = "Homeostatic (Predicted)")
refquery <- SetIdent(refquery, cells = reac_pred, value = "Reactive (Predicted)")
refquery[["state"]] <- Idents(refquery)
transduced <- WhichCells(querydata, expression = mScarlet > 0)
non_transduced <- WhichCells(querydata, expression = mScarlet == 0)
q_lps <- WhichCells(query.transferred.filtered, expression = treatment == "LPS")
Idents(refquery) <- "treat"
refquery <- SetIdent(refquery, cells = transduced, value = "Transduced")
refquery <- SetIdent(refquery, cells = non_transduced, value = "Untransduced")
refquery <- SetIdent(refquery, cells = q_lps, value = "LPS(Query)")
refquery[["ident"]] <- Idents(refquery)
Idents(query.transferred.filtered) <- "transduction"
query.transferred.filtered <- SetIdent(query.transferred.filtered, cells = q_lps, value = "LPS(Query)")
query.transferred.filtered[["ident"]] <- Idents(query.transferred.filtered)
query.transferred.filtered$ident <- as.character(query.transferred.filtered$ident)
refquery$id <- factor(refquery$id, levels = c("Reference","Query"))
as.data.frame(refquery[[c("nCount_RNA","nFeature_RNA","id")]]) %>%
ggplot(mapping = aes(x=id, y=nCount_RNA)) +
geom_violin() +
geom_jitter(aes(color = id), size = 0.3, height = 0, width = 0.3) +
scale_x_discrete(labels = c("Reference\n(10x)","Query\n(Smart-seq2)")) +
scale_y_continuous(expand = c(0.1,0.1)) +
xlab(NULL) +
ylab("UMI counts") +
theme_linedraw() +
theme(
legend.position = "none",
axis.text.x = element_text(size = 16),
axis.text.y = element_text(size = 14),
axis.title.y = element_text(size = 18)
)
as.data.frame(refquery[[c("nFeature_RNA","id")]]) %>%
ggplot(mapping = aes(x=id, y=nFeature_RNA)) +
geom_violin() +
geom_jitter(aes(color = id), size = 0.3, height = 0, width = 0.3) +
scale_x_discrete(labels = c("Reference\n(10x)","Query\n(Smart-seq2)")) +
scale_y_continuous(expand = c(0.1,0.1)) +
xlab(NULL) +
ylab("Genes Detected") +
theme_linedraw() +
theme(
legend.position = "none",
axis.text.x = element_text(size = 16),
axis.text.y = element_text(size = 14),
axis.title.y = element_text(size = 18)
)
Identify differentially expression genes (DEGs).
Idents(refdata) <- "treat"
mks.refdata.all <- FindAllMarkers(refdata, only.pos = T)
Idents(query.transferred.filtered) <- "group"
mks.querydata.all <- FindAllMarkers(query.transferred.filtered, only.pos = T)
Parse DEGs.
mkset.1 <- intersect(mks.refdata.all[mks.refdata.all$cluster == "Saline",]$gene,
mks.querydata.all[mks.querydata.all$cluster == "AAV",]$gene)
mkset.2 <- mks.refdata.all[mks.refdata.all$cluster == "Poly(i:c)",]$gene
mkset.3 <- intersect(mks.refdata.all[mks.refdata.all$cluster %in% c("LPS","Poly(i:c)"),]$gene,
mks.querydata.all[mks.querydata.all$cluster == "LPS",]$gene)
Preparing for heatmap plot.
refdata$treat <- factor(refdata$treat, levels = c("Saline","Poly(i:c)","LPS"))
refdata <- ScaleData(refdata, features = c(mkset.1, mkset.2, mkset.3, VariableFeatures(refdata)))
Heatmap of DEGs in Saline, Poly(i:c), LPS groups in reference dataset.
## Downsample for demo
DoHeatmap(subset(refdata, downsample = 500),
features = c(mkset.1, mkset.2, mkset.3),
group.by = "treat",
group.colors = brewer.pal(n = 12, name = "Paired")[c(1,3,5)]) +
scale_fill_distiller(palette = "RdBu") +
theme(
axis.text.y = element_blank()
)
## Extremely slow, could crash Rstudio. Only run on high performance work-stations or clusters.
# DoHeatmap(refdata,
# features = c(mkset.1, mkset.2, mkset.3),
# group.by = "treat",
# group.colors = brewer.pal(n = 12, name = "Paired")[c(1,3,5)],
# label = FALSE) +
# scale_fill_distiller(palette = "RdBu") +
# theme(
# axis.text.y = element_blank()
# ) +
# NoLegend()
Idents(refdata) <- "treat"
features_hom = c("P2ry12","Cx3cr1","Sall1")
features_hom_lab = c( "Csf1r")
vplots = list()
for (feature in features_hom){
vplots[[feature]] <- FetchData(refdata, vars = c("treat", feature)) %>%
ggplot(aes_string(x = "treat", y = feature)) +
geom_violin(aes_string(fill = "treat")) +
geom_jitter(aes_string(color = "treat"), size = 0.3, alpha = 0.2, width = 0.25) +
scale_color_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,3,5)]) +
scale_fill_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,3,5)]) +
theme_classic() +
theme(
legend.position = "none",
axis.title.x = element_blank(),
axis.text.x = element_blank(),
axis.title.y = element_text(size = 16, family = "Arial"),
axis.text.y = element_text(size = 12, family = "Arial")
)
}
for (feature in features_hom_lab){
vplots[[feature]] <- FetchData(refdata, vars = c("treat", feature)) %>%
ggplot(aes_string(x = "treat", y = feature)) +
geom_violin(aes_string(fill = "treat")) +
geom_jitter(aes_string(color = "treat"), size = 0.3, alpha = 0.2, width = 0.25) +
scale_color_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,3,5)]) +
scale_fill_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,3,5)]) +
theme_classic() +
theme(
legend.position = "none",
axis.title.x = element_blank(),
axis.text.x = element_text(size = 15, family = "Arial"),
axis.title.y = element_text(size = 16, family = "Arial"),
axis.text.y = element_text(size = 12, family = "Arial")
)
}
vplots[["P2ry12"]] / vplots[["Cx3cr1"]] / vplots[["Sall1"]] / vplots[["Csf1r"]]
features_rea = c("Il1b","Spp1","Ifitm3")
features_rea_lab = c("Ccl5")
vplots = list()
for (feature in features_rea){
vplots[[feature]] <- FetchData(refdata, vars = c("treat", feature)) %>%
ggplot(aes_string(x = "treat", y = feature)) +
geom_violin(aes_string(fill = "treat")) +
geom_jitter(aes_string(color = "treat"), size = 0.3, alpha = 0.2, width = 0.25) +
scale_color_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,3,5)]) +
scale_fill_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,3,5)]) +
theme_classic() +
theme(
legend.position = "none",
axis.title.x = element_blank(),
axis.text.x = element_blank(),
axis.title.y = element_text(size = 16, family = "Arial"),
axis.text.y = element_text(size = 12, family = "Arial")
)
}
for (feature in features_rea_lab){
vplots[[feature]] <- FetchData(refdata, vars = c("treat", feature)) %>%
ggplot(aes_string(x = "treat", y = feature)) +
geom_violin(aes_string(fill = "treat")) +
geom_jitter(aes_string(color = "treat"), size = 0.3, alpha = 0.2, width = 0.25) +
scale_color_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,3,5)]) +
scale_fill_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,3,5)]) +
theme_classic() +
theme(
legend.position = "none",
axis.title.x = element_blank(),
axis.text.x = element_text(size = 15, family = "Arial"),
axis.title.y = element_text(size = 16, family = "Arial"),
axis.text.y = element_text(size = 12, family = "Arial")
)
}
vplots[["Il1b"]] / vplots[["Spp1"]] / vplots[["Ifitm3"]] / vplots[["Ccl5"]]
Preparing for heatmap plot.
query.transferred.filtered$ident <- factor(query.transferred.filtered$ident, levels = c("Untransduced","Transduced","LPS(Query)"))
query.transferred.filtered <- ScaleData(query.transferred.filtered, features = c(mkset.1, mkset.2, mkset.3, VariableFeatures(query.transferred.filtered)))
Heatmap of DEGs in AAV, LPS groups in query dataset.
## Demo
DoHeatmap(query.transferred.filtered,
features = c(mkset.1,mkset.3),
group.by = "ident",
group.colors = brewer.pal(n = 12, name = "Paired")[c(1,8,6)]) +
theme(
axis.text.y = element_blank()
) +
scale_fill_distiller(palette = "PiYG")
# DoHeatmap(query.transferred.filtered,
# features = c(mkset.1,mkset.3),
# group.by = "ident",
# group.colors = brewer.pal(n = 12, name = "Paired")[c(1,8,6)],
# label = FALSE) +
# theme(
# axis.text.y = element_blank()
# ) +
# scale_fill_distiller(palette = "PiYG") +
# NoLegend()
Idents(query.transferred.filtered) <- "ident"
features_hom = c("P2ry12","Cx3cr1","Sall1")
features_hom_lab = c( "Csf1r")
vplots = list()
for (feature in features_hom){
vplots[[feature]] <- FetchData(query.transferred.filtered, vars = c("treatment", feature)) %>%
ggplot(aes_string(x = "treatment", y = feature)) +
geom_jitter(aes_string(color = "treatment"), size = 0.6, alpha = 1, width = 0.25) +
geom_violin(aes_string(fill = "treatment"), alpha = 0.6) +
scale_color_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,7,5)]) +
scale_fill_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,7,5)]) +
theme_classic() +
theme(
legend.position = "none",
axis.title.x = element_blank(),
axis.text.x = element_blank(),
axis.title.y = element_text(size = 16, family = "Arial"),
axis.text.y = element_text(size = 12, family = "Arial")
)
}
for (feature in features_hom_lab){
vplots[[feature]] <- FetchData(query.transferred.filtered, vars = c("treatment", feature)) %>%
ggplot(aes_string(x = "treatment", y = feature)) +
geom_jitter(aes_string(color = "treatment"), size = 0.6, alpha = 1, width = 0.25) +
geom_violin(aes_string(fill = "treatment"), alpha = 0.6) +
scale_color_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,7,5)]) +
scale_fill_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,7,5)]) +
scale_x_discrete(labels = c("Untransd.","Transd.","LPS")) +
theme_classic() +
theme(
legend.position = "none",
axis.title.x = element_blank(),
axis.text.x = element_text(size = 15, family = "Arial"),
axis.title.y = element_text(size = 16, family = "Arial"),
axis.text.y = element_text(size = 12, family = "Arial")
)
}
vplots[["P2ry12"]] / vplots[["Cx3cr1"]] / vplots[["Sall1"]] / vplots[["Csf1r"]]
features_rea = c("Il1b","Spp1","Ifitm3")
features_rea_lab = c("Ccl5")
vplots = list()
for (feature in features_rea){
vplots[[feature]] <- FetchData(query.transferred.filtered, vars = c("treatment", feature)) %>%
ggplot(aes_string(x = "treatment", y = feature)) +
geom_jitter(aes_string(color = "treatment"), size = 0.6, alpha = 1, width = 0.25) +
geom_violin(aes_string(fill = "treatment"), alpha = 0.6) +
scale_color_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,7,5)]) +
scale_fill_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,7,5)]) +
theme_classic() +
theme(
legend.position = "none",
axis.title.x = element_blank(),
axis.text.x = element_blank(),
axis.title.y = element_text(size = 16, family = "Arial"),
axis.text.y = element_text(size = 12)
)
}
for (feature in features_rea_lab){
vplots[[feature]] <- FetchData(query.transferred.filtered, vars = c("treatment", feature)) %>%
ggplot(aes_string(x = "treatment", y = feature)) +
geom_jitter(aes_string(color = "treatment"), size = 0.6, alpha = 1, width = 0.25) +
geom_violin(aes_string(fill = "treatment"), alpha = 0.6) +
scale_color_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,7,5)]) +
scale_fill_manual(values = brewer.pal(n = 12, name = "Paired")[c(1,7,5)]) +
scale_x_discrete(labels = c("Untransd.","Transd.","LPS")) +
theme_classic() +
theme(
legend.position = "none",
axis.title.x = element_blank(),
axis.text.x = element_text(size = 15, family = "Arial"),
axis.title.y = element_text(size = 16, family = "Arial"),
axis.text.y = element_text(size = 12, family = "Arial")
)
}
vplots[["Il1b"]] / vplots[["Spp1"]] / vplots[["Ifitm3"]] / vplots[["Ccl5"]]
Backup the processed Seurat object using h5 format.
## Avoid using factor while saving to other formats
refquery$treat <- as.character(refquery$treat)
refquery$state <- as.character(refquery$state)
refquery$ident <- as.character(refquery$ident)
if(!dir.exists('../data/AAV_MG_scRNAseq')){dir.create('../data/AAV_MG_scRNAseq')}
SaveH5Seurat(refquery, filename = '../data/AAV_MG_scRNAseq/refquery.h5Seurat', overwrite = T, verbose = F)
SaveH5Seurat(query.transferred.filtered, filename = '../data/AAV_MG_scRNAseq/query_transferred.h5Seurat', overwrite = T, verbose = F)
Convert('../data/AAV_MG_scRNAseq/query_transferred.h5Seurat', dest = "h5ad", overwrite = T, verbose = F)
sessionInfo()
## R version 4.0.5 (2021-03-31)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.6 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=zh_CN.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=zh_CN.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=zh_CN.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=zh_CN.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] RColorBrewer_1.1-3 forcats_0.5.1 stringr_1.4.0
## [4] dplyr_1.0.8 purrr_0.3.4 readr_2.1.2
## [7] tidyr_1.2.0 tibble_3.1.6 ggplot2_3.3.5
## [10] tidyverse_1.3.1 SeuratDisk_0.0.0.9019 SeuratObject_4.0.4
## [13] Seurat_4.1.0
##
## loaded via a namespace (and not attached):
## [1] readxl_1.4.0 backports_1.4.1 plyr_1.8.7
## [4] igraph_1.3.0 lazyeval_0.2.2 splines_4.0.5
## [7] listenv_0.8.0 scattermore_0.8 digest_0.6.29
## [10] htmltools_0.5.2 fansi_1.0.3 magrittr_2.0.3
## [13] tensor_1.5 cluster_2.1.3 ROCR_1.0-11
## [16] limma_3.46.0 tzdb_0.3.0 globals_0.14.0
## [19] modelr_0.1.8 matrixStats_0.61.0 spatstat.sparse_2.1-0
## [22] colorspace_2.0-3 rvest_1.0.2 ggrepel_0.9.1
## [25] haven_2.4.3 xfun_0.30 crayon_1.5.1
## [28] jsonlite_1.8.0 spatstat.data_2.1-4 survival_3.3-1
## [31] zoo_1.8-9 glue_1.6.2 polyclip_1.10-0
## [34] gtable_0.3.0 leiden_0.3.9 future.apply_1.8.1
## [37] abind_1.4-5 scales_1.1.1 DBI_1.1.2
## [40] spatstat.random_2.2-0 miniUI_0.1.1.1 Rcpp_1.0.8.3
## [43] viridisLite_0.4.0 xtable_1.8-4 reticulate_1.24
## [46] spatstat.core_2.4-2 bit_4.0.4 htmlwidgets_1.5.4
## [49] httr_1.4.2 ellipsis_0.3.2 ica_1.0-2
## [52] farver_2.1.0 pkgconfig_2.0.3 sass_0.4.1
## [55] uwot_0.1.11 dbplyr_2.1.1 deldir_1.0-6
## [58] utf8_1.2.2 labeling_0.4.2 tidyselect_1.1.2
## [61] rlang_1.0.2 reshape2_1.4.4 later_1.3.0
## [64] munsell_0.5.0 cellranger_1.1.0 tools_4.0.5
## [67] cli_3.2.0 generics_0.1.2 broom_0.7.12
## [70] ggridges_0.5.3 evaluate_0.15 fastmap_1.1.0
## [73] yaml_2.3.5 goftest_1.2-3 knitr_1.38
## [76] bit64_4.0.5 fs_1.5.2 fitdistrplus_1.1-8
## [79] RANN_2.6.1 pbapply_1.5-0 future_1.24.0
## [82] nlme_3.1-157 mime_0.12 xml2_1.3.3
## [85] hdf5r_1.3.5 compiler_4.0.5 rstudioapi_0.13
## [88] plotly_4.10.0 png_0.1-7 spatstat.utils_2.3-0
## [91] reprex_2.0.1 bslib_0.3.1 stringi_1.7.6
## [94] highr_0.9 RSpectra_0.16-0 lattice_0.20-45
## [97] Matrix_1.4-1 vctrs_0.4.0 pillar_1.7.0
## [100] lifecycle_1.0.1 spatstat.geom_2.4-0 lmtest_0.9-40
## [103] jquerylib_0.1.4 RcppAnnoy_0.0.19 data.table_1.14.2
## [106] cowplot_1.1.1 irlba_2.3.5 httpuv_1.6.5
## [109] patchwork_1.1.1 R6_2.5.1 promises_1.2.0.1
## [112] KernSmooth_2.23-20 gridExtra_2.3 parallelly_1.31.0
## [115] codetools_0.2-18 MASS_7.3-56 assertthat_0.2.1
## [118] withr_2.5.0 sctransform_0.3.3 mgcv_1.8-40
## [121] parallel_4.0.5 hms_1.1.1 grid_4.0.5
## [124] rpart_4.1-15 rmarkdown_2.13 Rtsne_0.15
## [127] shiny_1.7.1 lubridate_1.8.0
Stuart, T. et al. Comprehensive Integration of Single-Cell Data. Cell 177, 1888-1902.e21 (2019).↩︎